Abstract

Modern missions of government and private organizations rely on computer networks to operate. As
evidenced by several well-publicized cyber breaches, these missions are under attack. Several cyber
defensive measures have been proposed to mitigate this threat; some are meant to protect individual
hosts on the network, and others are designed to protect the network at large. From a qualitative
perspective, these mitigations seem to improve security, but there is no quantitative assessment of
their effectiveness with respect to a complete network system and the cyber-supported mission for
which the network exists. The purpose of this paper is to examine network-level cyber defensive
mitigations and quantify their impact on network security and mission performance. Testing such
mitigations in a live network environment is generally not possible due to the expense, and thus a
modeling and simulation approach is utilized. Our approach employs a modularized hierarchical
simulation framework to model a complete cyber system and its relevant dynamics at multiple scales.
We conduct experiments that test the effectiveness of network-level mitigations from the
perspectives of security and mission performance. Additionally, we introduce a novel, unified metric
for mitigation effectiveness that takes both of these perspectives into account and provides a
single measurement that is convenient and easily accessible to security practitioners.
1  Introduction

Cyber attacks are increasing at an alarming rate. As exhibited by a number of high-profile cyber
breaches, the damage that these attacks can cause is substantial. To counter this threat,
organizations such as the SANS Institute, Google, Microsoft, and the Information Assurance
Directorate of the National Security Agency, among others, have proposed several cyber defensive
mitigations. Some of these mitigations are meant to protect at the host level via security controls
deployed on individual network devices, while others are designed to protect at the network level
via security controls deployed to the network at large. Given sufficient resources, network
administrators and security practitioners could deploy all recommended mitigations to maximize a
network's security posture. Unfortunately, the reality for most practitioners is one in which
allotted security resources are limited, and thus they must choose mitigations that will provide the
most security benefit for their network. Proposed mitigations, however, have not been quantitatively
assessed for effectiveness and, consequently, are not ranked or prioritized in any way. This forces
practitioners to rely on their own judgement to select appropriate mitigations.

Another salient point is that networks do not exist for their own sake, but rather exist to support
an organizational mission. This means that practitioners must consider tradeoffs between security
and mission performance. Accordingly, mitigations should be examined in the context of a complete
network system where both security and mission impact are taken into account.

Massachusetts Institute of Technology Lincoln Laboratory, USA

Corresponding author:
Neal Wagner, Massachusetts Institute of Technology Lincoln Laboratory,
244 Wood Street, Lexington, MA 02420, USA.
Email: neal.wagner@ll.mit.edu

Journal of Defense Modeling and Simulation: Applications, Methodology, Technology 14(3)

It is important to note that cyber systems, like social systems, include human actors and are thus
stochastic in nature. For this reason, a network system under a given set of conditions (e.g.,
users, attackers, and defensive mitigations) will not always lead to the same outcome. Rather, the
pairing of a network system and a set of environmental conditions that affect it will generate a
distribution of outcomes that represents the range of possible results. When evaluating a mitigation
for a particular network environment, it is therefore necessary to execute numerous tests in order
to determine which outcomes are probable and which are improbable. Conducting a large number of
network-scale tests in a live environment requires significant resources and is generally
infeasible. We thus focus on a modeling and simulation approach because of its relatively low cost.

This paper examines two proposed defensive mitigations designed to protect the network at large from
cyber attack: (i) Segregation of Networks and Functions (SNF); and (ii) Limiting
Workstation-to-Workstation Communication (LWC). The purpose is to provide a quantitative assessment
of these network-level mitigations in the context of a complete network system and to consider
mitigation effectiveness with respect to two fundamental network concerns, security and mission
impact. To this end, a modularized hierarchical simulation framework is utilized to capture and
integrate relevant dynamics at the sub-net/enclave and full network scales. We also describe a novel
metric that combines results for security and mission performance into a single unified measure of
mitigation effectiveness that is convenient and easily accessible to security practitioners and
network analysts.

The  rest  of  this  paper  is  organized  as  follows:  Section  2 
discusses  the  current  state  of  the  practice  with  respect  to 
the  use  of  modeling  and  simulation  in  the  cyber  security 
domain;  Section  3  describes  the  network-level  mitigations 
examined;  Section  4  provides  details  of  the  multi-scale 
hierarchical simulation model, including component models
capturing cyber threat, defense, and mission; Section 5
gives  metrics  quantifying  security  and  mission  impact  and 
describes  our  unified  measure  for  mitigation  effectiveness; 
Section  6  discusses  our  simulation  experiments;  and 
Section  7  concludes. 

2  Domain  characterization 

Cyber systems contain a mix of computerized processes, hardware entities, and human actors in an
environment that is constantly shifting. These complexities make it difficult to predict the effects
that policy changes will have on a network system and the mission it is intended to support. The
cyber security community is charged with recommending defensive measures to improve network security
and mitigate the threat of cyber attack. Currently, these recommendations are put forth as
security-related best practices (e.g., see Microsoft's Enterprise Security Best Practices). It is
important to note that, generally, these recommendations are made via the judgement of subject
matter experts and are not based on empirical analysis of actual network tests. As mentioned in
Section 1, the reason for this is that executing security-related tests at the network scale is
oftentimes prohibitively expensive.

In response to this situation, the modeling and simulation community has generated a body of work
that is focused on capturing and analyzing network systems with the intent of improving their
security. The following section summarizes the current state of this work and discusses the
contributions of this paper and its place within this greater body of cyber modeling and simulation
research.

2.1  State of the practice

A large number of studies have used modeling and simulation (mod/sim) as a tool to improve the
detection of network intrusions. These studies focus on network situational awareness and use
mod/sim to execute initial tests of newly proposed intrusion detection techniques before moving
these techniques to the prototyping stage.

Another set of studies focuses on utilizing mod/sim for the purpose of investigating network
security in the context of specific threats and corresponding defenses. A study combining discrete
event simulation with meta-heuristic optimization to simulate network attacks and optimize network
defenses is provided in Kiesling et al. An agent-based model investigating cooperative botnet
attacks and corresponding defenses is presented in Kotenko et al. In Masi et al., a Markov model is
used to simulate worm attacks, with simulation splitting techniques for efficient simulation of rare
catastrophic network states. A model built using OMNeT++ to simulate distributed denial-of-service
attacks on networks is presented in Mina et al. In Priest et al., an agent-based model is used to
evaluate the performance of candidate security techniques that rely on a moving target strategy to
defend against cyber attack. In Toutonji et al. and Yu et al., epidemiological models are employed
to simulate malware propagation over networks. In Wagner et al., an agent-based simulation examines
the effectiveness of security policies seeking to mitigate the threat posed by unauthorized hardware
on a network.

Another vein of research applies game theory to model the interactions between attack and defense.
Some recent examples include Clark et al. and Pawlick et al. In Clark et al., a game-theoretic
approach is applied to evaluate network IP address randomization strategies for their ability to
confuse attackers trying to locate network devices to attack. In Pawlick et al., games are used to
model interactions between user devices and cloud-based systems that are under attack and sometimes
controlled by the attacker.

This  paper  utilizes  a  mod/sim  approach  to  examine  the 
effectiveness  of  two  widely  known  network-level  cyber 
defensive  mitigations.  Our  main  contribution  is  to  provide 
a quantitative assessment of these mitigations' effectiveness
at the network scale with respect to two fundamental
network  concerns:  security  and  mission  impact.  We  also 
use  a  novel  metric  to  combine  results  of  these  concerns 
into  a  single  unified  measure  that  is  easily  accessible  to 
security  analysts  and  practitioners. 

3  Network-level  cyber  defensive 
mitigations 

Cyber attacks have caused significant damage to enterprise networks in recent years. Quantifying the
performance of defensive mitigations helps network administrators make better decisions to improve
the security posture of networks against attack. This paper examines two defensive mitigations that
seek to provide security at the network level, SNF and LWC. Both of these mitigations attempt to
thwart an attacker's ability to move within a network after he/she has gained initial entry to the
network.

3.1  SNF  mitigation 

The SNF mitigation is concerned with partitioning a network into sections or segments to protect
sensitive or valuable resources. Different cyber assets (e.g., hosts, servers, sub-nets) are used
for different organizational functions (e.g., public-facing web services, financial transactions,
human resource management, etc.) having differing sensitivity levels and security requirements. The
idea is to segregate these different groups of cyber assets based on their function and restrict
communications between the segregated groups. This is thought to improve security by hampering the
ability of an attacker, who has already gained a foothold on the network, to traverse the network,
spread compromise, and acquire further access to sensitive resources. Segregation is typically
implemented by firewalls, network egress and ingress filters, application-level filters, and/or
physical (hardware) infrastructure.

3.2  LWC  mitigation 

The LWC mitigation picks up where SNF leaves off. The idea is to regulate communications at a finer
granularity. LWC controls communications to a greater extent than SNF: even devices within the same
organizational function may have limited communications (or be prevented from communicating
outright). Here, the goal is to enforce the principle of least privilege and to allow communication
privileges only when necessary for task execution. LWC is implemented by setting device-level
firewall rules (e.g., Windows Firewall rules), disabling remote logon access to devices, and using
private virtual LANs.

Both mitigations are about partitioning a network into segments and controlling communications
between segments and between segments and the Internet. We refer to an individual segment of a
partitioned network as an enclave, which is a group of network devices with homogeneous
reachability.

4  Multi-scale  hierarchical  model 

We wish to quantitatively assess the effectiveness of the SNF and LWC mitigations in the context of
a complete network system. For this purpose, we employ a multi-scale model that characterizes
dynamics at the enclave and network scales. The complete model is modularized via a hierarchical
framework in which enclave-scale dynamics (i.e., dynamics internal to a single enclave) are captured
separately in a single model, and simulation results from this model are then used to inform a
network-scale model. The model is informed by a proprietary testbed environment in which a
partitioned network is captured at a coarse-grained level of abstraction where only the
vulnerability level of individual network enclaves is measured. A graphical overview of the full
hierarchical model is given in Figure 1. As the figure shows, the enclave model is parameterized by
outputs from testbed experiments (see Section 4.2). Simulation runs are executed on this enclave
model, results are aggregated, and these results are used to parameterize the network model (right
of the figure), which captures an abstracted full network system with attack/defense dynamics and
mission users and their associated operations (details provided in Section 4.3).

Figure 1. Multi-scale, hierarchical model: the enclave model captures dynamics internal to a single
enclave; the network model captures network-scale dynamics of security and mission performance.

With respect to this study, the hierarchical model structure
is beneficial for the following reasons:

•  it  allows  the  model  to  incorporate  data  gleaned 
from  testbed  experiments; 

•  it  provides  for  a  reduction  in  model  implementation 
effort  due  to  the  modularity  gained  by  dividing  the 
simulation  model  into  multiple  components;  and 

•  it  supports  quicker  simulation  execution  times  due 
to  reduced  complexity  at  the  larger  scale  network 
model. 

With respect to future studies, this model structure provides a simulation framework that is
flexible and re-usable: flexible because different versions of one component model can be
substituted into the framework without having to change the underlying implementation of the other
component model(s); re-usable because a component model may be used as part of multiple complete
simulation models with potentially little or no modification. This study takes advantage of the
framework's re-usability by re-tooling a network component model capturing mission users and
operations from a previous study, which itself was re-tooled from Priest et al. Planned future work
will take advantage of the framework's flexibility: it will focus on developing a simulation model
to replace the testbed environment so that more partitioning scenarios can be easily examined, as
the resource cost of executing scenarios on the testbed is relatively high. The following sections
detail the components of the complete model and their integration.

4.1  Testbed environment

As discussed above, the testbed is a proprietary environment that supports coarse-grained tests of a
partitioned network. A partitioning architecture that divides the network into enclaves and
restricts communications between enclaves and between enclaves and the Internet can be instantiated.
The environment modeled by the testbed is depicted in Figure 2. In this environment the attacker
residing on the Internet is restricted by the partitioning architecture and can only communicate
with enclaves as allowed by the architecture. For example, as shown in the figure, suppose a network
is partitioned into three enclaves where Enclave 1 is allowed communication with the Internet and
Enclaves 2 and 3 are not. Additionally, suppose communications between Enclave 1 and Enclave 3 are
disallowed by the architecture. As displayed in the figure, the attacker can penetrate the network
only through Enclave 1. If the attacker is successful at compromising Enclave 1 (indicated by the
enclave's red color in the figure), then he/she can attempt to spread to Enclave 2 via the
communication channel allowed by the architecture, but cannot spread directly to Enclave 3 because
the architecture blocks communications between Enclaves 1 and 3. The testbed environment specifies
communication channels by allowing or disallowing software services between enclaves. The
environment also includes the notion of enclave cleansing by the defender (depicted in the upper
right graphic of the figure): compromised enclaves are periodically cleansed and restored to an
uncompromised state.

Figure 2. Testbed environment: Partitioning architecture divides network into enclaves and restricts
communications between enclaves and between enclaves and the Internet. Attacker resides on the
Internet and attempts to compromise enclaves via communication channels. Defender periodically
cleanses compromised enclaves.
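
The three-enclave example can be illustrated with a small reachability check. The sketch below is not the proprietary testbed; it only encodes the allowed channels as a directed graph and counts how many enclave-to-enclave hops an attacker on the Internet needs to reach each enclave. The dictionary layout, the function name `attack_paths`, and the channel from Enclave 2 to Enclave 3 are assumptions made for illustration (the text states only that Enclave 3 cannot be reached directly from Enclave 1).

```python
from collections import deque

# Hypothetical encoding of the example architecture: each key may open a
# channel to the enclaves in its value set. The enclave2 -> enclave3 channel
# is an assumption; the text only rules out enclave1 -> enclave3.
ALLOWED_CHANNELS = {
    "internet": {"enclave1"},
    "enclave1": {"enclave2"},
    "enclave2": {"enclave3"},
    "enclave3": set(),
}


def attack_paths(channels, source="internet"):
    """Breadth-first search over allowed channels.

    Returns a dict mapping each reachable enclave to the minimum number of
    hops the attacker needs to get there from `source`.
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in sorted(channels.get(node, ())):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    hops.pop(source)  # the attacker's own location is not an enclave
    return hops


if __name__ == "__main__":
    for enclave, n_hops in attack_paths(ALLOWED_CHANNELS).items():
        print(enclave, "reachable in", n_hops, "hop(s)")
```

Under these assumptions only Enclave 1 is reachable in a single hop, which matches the statement that the attacker can penetrate the network only through Enclave 1.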

The testbed uses data from real software vulnerabilities and corresponding exploits to characterize
the vulnerability level of individual enclaves with respect to a given network partitioning
architecture and enclave-cleansing rate. The environment measures the probability that an enclave
has been penetrated but does not capture instances of actual device compromise within an enclave.
This measurement informs the enclave component model (depicted in Figure 1).

4.2  Enclave  model 

The  enclave  model  seeks  to  characterize  the  dynamics  of 
attack  and  defense,  at  the  device  level,  within  a  single 
enclave. The threat model is that of an attacker who penetrates
the enclave by compromising a single enclave device
and  attempts  to  spread  to  other  enclave  devices.  This  threat 
model  is  depicted  in  Figure  3. 

An  epidemic  model  is  used  to  capture  device-to-device 
infection  spreading  within  an  enclave.  We  utilize  the 
Algorithm 1: Enclave model

procedure ENCLAVE(p_vuln, β, N)    ▷ p_vuln: probability enclave is vulnerable, β: infection spread rate, N: no. of enclave devices
    repeat
        t ← 0
        d_comp ← [empty set]    ▷ Set of compromised enclave devices
        d_uncomp ← [all enclave devices]    ▷ Set of uncompromised enclave devices
        r ← U[0, 1]    ▷ r is assigned a random value ∈ [0, 1]
        I(0) ← 1
        while r < p_vuln do    ▷ Enclave is vulnerable, infection spread can occur
            I(t) ← f(I(0), β, N, t)    ▷ Compute no. of infected devices using Equation (1)
            if |d_comp| < I(t) then
                randomly remove I(t) − |d_comp| devices from d_uncomp, add to d_comp
            t ← t + 1
            r ← U[0, 1]
    until Total timesteps > Maximum timesteps


Figure 3. Enclave threat model: attacker compromises a single device (initial penetration) and then
spreads throughout enclave (infection spread).

propagation model proposed by Yu et al. and given by Equation (1):

$$I(t) = I(0)\, e^{\beta N t} \qquad (1)$$

where $t$ is time, $I(0)$ is the number of infected devices at $t = 0$, $\beta$ is the infection
propagation rate, $N$ is the total number of devices in the enclave, and $I(t)$ computes the total
number of infected devices at time $t$. Initial enclave penetration is modeled as compromise of a
single device (i.e., $I(0) = 1$).

The  defense  model  is  an  abstraction  of  the  protection 
provided  by  the  network  partitioning  architecture  and 
enclave  cleansing  rate  captured  in  the  testbed  environment 
but  from  the  perspective  of  a  single  enclave.  The  model 
specifies  a  random  variable  to  capture  the  probability  that 
an  enclave  is  in  a  vulnerable  state  (i.e.,  whether  or  not  it 
has  been  penetrated  by  the  attacker).  When  the  enclave  is 
vulnerable,  infection  can  spread  from  device  to  device; 
when  the  enclave  is  not  vulnerable  (i.e.,  it  has  been 
cleansed by the defender), all enclave devices are uninfected.
The full enclave model is given by Algorithm 1.

In Algorithm 1, the probability that an enclave is vulnerable, $p_{\mathrm{vuln}}$, is specified by
the output of testbed experiments for the enclave being modeled. The model generates three outputs
that characterize the security of devices in the enclave: the expected number of devices that are
compromised at any given moment, the mean duration time of compromise for devices when they are
compromised, and the standard deviation of compromise duration times. As mentioned above, these
outputs are used to inform a network-scale model (depicted in Figure 1 and detailed below).
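
To make the enclave model and its three outputs concrete, the following is a minimal Python sketch in the spirit of Algorithm 1, not the authors' implementation. Several choices are assumptions made for illustration: the growth function is taken to be exponential in βNt and is capped at N, one random draw is made per timestep, a draw at or above p_vuln is treated as a cleansing event that resets the enclave, and the names `infected_count` and `simulate_enclave` are hypothetical.

```python
import math
import random
import statistics


def infected_count(i0, beta, n, t):
    """Assumed growth law: I(t) = I(0) * exp(beta * n * t), capped at n devices."""
    exponent = beta * n * t
    if exponent >= math.log(n / i0):  # cap reached; also avoids float overflow
        return n
    return int(i0 * math.exp(exponent))


def simulate_enclave(p_vuln, beta, n_devices, max_steps, seed=None):
    """Return (mean no. of compromised devices, mean duration, std of durations)."""
    rng = random.Random(seed)
    compromised_since = {}        # device id -> timestep at which it was compromised
    durations = []                # completed compromise durations, in timesteps
    compromised_device_steps = 0  # sum over timesteps of the compromised count
    t = 0                         # timesteps since the current penetration began
    for step in range(max_steps):
        if rng.random() < p_vuln:
            # Enclave is vulnerable: infection may spread to more devices.
            target = infected_count(1, beta, n_devices, t)
            need = target - len(compromised_since)
            if need > 0:
                clean = [d for d in range(n_devices) if d not in compromised_since]
                for dev in rng.sample(clean, need):
                    compromised_since[dev] = step
            t += 1
        else:
            # Enclave is cleansed: all devices return to the uncompromised state.
            durations.extend(step - start for start in compromised_since.values())
            compromised_since.clear()
            t = 0
        compromised_device_steps += len(compromised_since)
    # Close out devices that are still compromised when the run ends.
    durations.extend(max_steps - start for start in compromised_since.values())
    mean_compromised = compromised_device_steps / max_steps
    mean_duration = statistics.mean(durations) if durations else 0.0
    std_duration = statistics.pstdev(durations) if len(durations) > 1 else 0.0
    return mean_compromised, mean_duration, std_duration


if __name__ == "__main__":
    print(simulate_enclave(p_vuln=0.8, beta=0.01, n_devices=50, max_steps=1000, seed=1))
```

Dividing the first output by the number of enclave devices gives one plausible way to obtain a per-device compromise probability for the network-scale model, although the text does not spell out that conversion.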

4.3  Network model

The network model characterizes the dynamics of attack, defense, and mission operations at the scale
of a full network system. As discussed in Section 4, we leverage the re-usability of the
hierarchical modeling framework to re-tool and re-use a network component model that has been used
in two previous studies (Wagner et al. and Priest et al.).

Specifically, we utilize the network-scale mission model from these studies, which is based on a
military-style Air Operations Center (AOC). The AOC mission is tasked with gathering requests for
air operations and processing these into final flight plans. The mission model characterizes a
network-supported, time-sensitive mission that allows us to examine a defensive mitigation's ability
to protect the mission from attack. Any delay to the mission's completion is undesired. A mission
team involves three mission users and three database servers existing on different network devices.
We assume each mission device has at most one mission role. The abstracted AOC mission is shown in
Figure 4, where the mission users pass a payload from Database 1 to Database 3. The network includes
$N_m$ mission user teams sharing three mission servers, meaning that there are $3N_m + 3$ total
mission devices. Mission users require a fixed amount of uninterrupted time, $t_u$, to operate on
the payload before passing it to the next step. Non-mission operations, such as benign
communications, can
Algorithm 2: Network attack/defense model

procedure ENCLAVEINIT(encl)    ▷ Initialization of enclave encl
    p_dcomp ← p_dcomp for encl from global params list    ▷ p_dcomp: probability of device compromise
    t_dcomp ← t_dcomp for encl from global params list    ▷ t_dcomp: mean duration time of device compromise
    σ_tdcomp ← σ_tdcomp for encl from global params list    ▷ σ_tdcomp: standard deviation of device compromise duration times
    devices ← [all devices in encl]
    for all device ∈ devices do
        r ← U[0, 1]    ▷ r is assigned a random value ∈ [0, 1]
        if r < p_dcomp then    ▷ device should be marked as compromised
            μ ← t_dcomp
            σ ← σ_tdcomp
            comptime ← N(μ, σ)    ▷ Compute compromise duration time for device
            Mark device as compromised for time comptime

procedure NETWORK    ▷ Run network-scale attack/defense
    t ← 0
    enclaves ← [all enclaves in network]
    for all encl ∈ enclaves do    ▷ Initialize devices in each network enclave
        ENCLAVEINIT(encl)
    repeat
        t ← t + 1
        for all encl ∈ enclaves do
            devices ← [all devices in encl]
            for all device ∈ devices do
                if device compromise duration time is complete then
                    Mark device as uncompromised
            if all devices in encl are uncompromised then
                ENCLAVEINIT(encl)    ▷ Re-initialize devices in enclave encl
    until Total timesteps > Maximum timesteps


Figure 4. Abstracted AOC mission model: Three mission users
utilize three network hosts to interact with three database
servers to execute the mission.

occur  during  this  time,  but  suffering  a  compromise  to  a 
mission  device  will  delay  the  mission  until  that  device  is 
cleansed  and  restored. 

The threat/defense model is an abstraction of the attack and defense dynamics captured in the
enclave model but from a full network perspective where attack/defense outcomes vary depending on
the micro-environment specified for individual enclaves in the full network. The model specifies a
random variable to capture the probability that a device in a given network enclave is compromised.
At simulation time $t = 0$, this variable is used to determine which devices in a given enclave are
compromised and, for those that are compromised, a second random variable determines the duration of
compromise. This initialization process is repeated separately for each enclave of the network. As
simulation time progresses, compromised devices are cleansed and restored when their compromise
durations have completed. After all compromised devices of an enclave have been restored, the
initialization process is re-executed on the enclave to set compromised devices and their
corresponding compromise duration times. The full model of attack and defense at the network scale
is given by Algorithm 2.
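
The initialization and re-initialization cycle of Algorithm 2 can be sketched in Python as follows. This is an illustrative reading rather than the authors' code: the function names, the dictionary-based device records, and the parameter tuple layout are hypothetical, and rounding each normally distributed duration to a whole number of timesteps (with a minimum of one) is an assumption the text does not state.

```python
import random


def enclave_init(devices, p_dcomp, t_dcomp, sigma_tdcomp, rng):
    """Mark each device compromised with probability p_dcomp and draw its duration."""
    for dev in devices:
        if rng.random() < p_dcomp:
            # Assumption: durations are rounded and clamped to at least one timestep.
            dev["remaining"] = max(1, round(rng.gauss(t_dcomp, sigma_tdcomp)))
        else:
            dev["remaining"] = 0


def simulate_network(enclave_params, max_steps, seed=None):
    """Run the attack/defense cycle and return per-device downtime for each enclave.

    enclave_params maps an enclave name to a tuple
    (n_devices, p_dcomp, t_dcomp, sigma_tdcomp), i.e. the enclave model outputs.
    """
    rng = random.Random(seed)
    enclaves = {}
    for name, (n, p, mu, sigma) in enclave_params.items():
        devices = [{"remaining": 0, "downtime": 0} for _ in range(n)]
        enclave_init(devices, p, mu, sigma, rng)
        enclaves[name] = devices
    for _ in range(max_steps):
        for name, devices in enclaves.items():
            for dev in devices:
                if dev["remaining"] > 0:
                    dev["downtime"] += 1   # device is unavailable this timestep
                    dev["remaining"] -= 1  # reaching 0 marks it uncompromised
            if all(dev["remaining"] == 0 for dev in devices):
                # Every device is restored: re-initialize the enclave.
                _, p, mu, sigma = enclave_params[name]
                enclave_init(devices, p, mu, sigma, rng)
    return {name: [d["downtime"] for d in devs] for name, devs in enclaves.items()}


if __name__ == "__main__":
    params = {"enclave1": (4, 0.3, 5.0, 1.5), "enclave2": (6, 0.1, 8.0, 2.0)}
    print(simulate_network(params, max_steps=200, seed=7))
```

The per-device downtimes returned here are the raw quantities from which availability-based measures at the network scale can be computed.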

As  discussed  in  the  previous  section,  outputs  from  the 
enclave  model  are  used  to  inform  the  network  model.  In 
Algorithm  2,  the  probability  of  device  compromise,  the 
mean  compromise  duration  time  for  compromised  devices, 
and  the  standard  deviation  of  compromise  duration  times 
for  a  particular  enclave  are  specified  by  the  output  of 
enclave  model  experiments  for  that  enclave.  Network 
model  outputs  measure  the  overall  security  and  mission 
impact at the network scale. The following section provides
details on the metrics used to measure these fundamental
network concerns.

5  Measuring  mitigation  effectiveness 

In this study we model attacks on device availability. A device becomes inaccessible after a
successful attack and remains inaccessible until it is cleansed and restored. Our goal is to measure
the effectiveness of a defensive mitigation with respect to system security and mission impact. For
this purpose, we utilize the following two metrics, as in Wagner et al. and Priest et al.

Definition 1. System Security Index, $S_i$: The expected ratio of device availability time (i.e.,
device uptime) to total time, normalized to [0, 1]:

$$S_i = \frac{\sum_{i=1}^{3 \times N_m + 3} \dfrac{T^{i}_{\mathrm{up}}}{T}}{3 \times N_m + 3} \qquad (2)$$

where $T_{\mathrm{down}} = t_{\mathrm{comp}} + t_{\mathrm{cleanse}}$ is the total downtime for
device $i$, $T$ is the total simulation time, and $T_{\mathrm{up}}$ is the total uptime for device
$i$ ($T_{\mathrm{up}} = T - T_{\mathrm{down}}$, $T_{\mathrm{down}} = \sum \ldots$). As discussed in
Section 4, a device is inaccessible when it is compromised or while it is in the process of being
cleansed, and $t_{\mathrm{comp}}$ and $t_{\mathrm{cleanse}}$ represent the total compromise and
cleansing times for device $i$, respectively.
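
As a worked illustration of Equation (2), the short Python sketch below averages the uptime ratio over all mission devices. The function name and the example downtimes are hypothetical; the only relationships taken from the text are that uptime equals total time minus downtime and that there are $3N_m + 3$ mission devices.

```python
def system_security_index(downtimes, total_time):
    """Average over devices of T_up / T, where T_up = T - T_down (Equation (2))."""
    if total_time <= 0 or not downtimes:
        raise ValueError("need a positive total time and at least one device")
    ratios = [(total_time - min(d, total_time)) / total_time for d in downtimes]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Illustrative values only: one mission team (N_m = 1) gives 3 * 1 + 3 = 6
    # mission devices, observed over T = 100 timesteps.
    example_downtimes = [0, 0, 10, 20, 0, 30]
    print(round(system_security_index(example_downtimes, 100), 3))  # prints 0.9
```

A value of 1 means no mission device was ever unavailable during the run, and a value of 0 means every device was unavailable for the entire run.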

Definition 2. Mission Delay, $md_i$. The expected total time of device compromise, $t_{(delay|comp,d)}$, and device cleansing, $t_{(delay|cleanse,d)}$, that impedes a mission.

As detailed in Section 4.3, we model time-sensitive missions in which incurred delay is undesirable. When a mission-critical device d is compromised, the corresponding mission is delayed until the device has been cleansed. Mission delay is computed as:

$$ md_i = t_{(delay|comp,d)} + t_{(delay|cleanse,d)} \tag{3} $$

The expected total time of mission-impeding device compromise is computed as:

$$ t_{(delay|comp,d)} = E\left[t^{i}_{(delay|comp,d)}\right] = \frac{\sum_{i=1}^{M} t^{i}_{(delay|comp,d)}}{M} \tag{4} $$

where $t^{i}_{(delay|comp,d)}$ is the delay for mission i due to compromise and M is the number of executed missions.

The expected total time of mission-impeding device cleansing is computed as:

$$ t_{(delay|cleanse,d)} = E\left[t^{i}_{(delay|cleanse,d)}\right] = \frac{\sum_{i=1}^{M} t^{i}_{(delay|cleanse,d)}}{M} \tag{5} $$

where $t^{i}_{(delay|cleanse,d)}$ is the delay for mission i due to cleansing.

5.1 A unified metric to evaluate mitigation effectiveness

The mitigations given in Section 3 exist to support an organizational mission and thus they should be evaluated in the context of the complete system where the goal is to maximize the system security index and to minimize the mission delay. These metrics evaluate different effects of a given mitigation. It is convenient to have a unified metric to evaluate the effectiveness of a mitigation and to compare the effectiveness of multiple mitigations.

Definition 3. Unified Metric, $m_g$. A measure that characterizes the security and mission delay ($s_i$ and $md_i$ from Equations (2) and (3), respectively) inherent to a simulated network environment captured via Monte Carlo experiments. The metric incorporates effects of mean, median, and variance of results from Monte Carlo simulation runs, normalized to [0,1].

To generate this metric, we first unify the security index (Equation (2)) for Monte Carlo experiments of simulation scenarios with and without a given mitigation as shown below:

$$ s_M = \int f_2(f_1(s_{i,M})) \, ds_{i,M}, \qquad s_{noM} = \int f_2(f_1(s_{i,noM})) \, ds_{i,noM}, \qquad s_g = \frac{s_M - s_{noM}}{\max(s_M, s_{noM})} \tag{6} $$

where $s_{i,M}$ and $s_{i,noM}$ represent the computed security index values for scenario experiments with and without the given mitigation, respectively. $f_1$ is a function $f_1 : X \rightarrow f_1^*$ that takes an arbitrary input X, which might be $s_{i,M}$ or $s_{i,noM}$, and then outputs an approximation function $f_1^*$. $f_2$ is a function $f_2 : f_1^* \rightarrow f_2^*$ that takes an arbitrary function $f_1^*$ and maps it into an approximation function $f_2^*$. $s_M$ and $s_{noM}$ are the approximated security index values with and without mitigation, respectively. $s_g$ represents the unified security index, normalized to [-1, +1]. A computed value $s_g \in [-1, 0)$ signifies that the proposed mitigation decreases overall security, while a computed value $s_g \in (0, +1]$ signifies that the proposed mitigation increases overall security.

Secondly, the mission delay (Equation (3)) for Monte Carlo experiments of scenarios with and without the given mitigation ($m_M$ and $m_{noM}$, respectively) is unified by using the following equation:

$$ m_M = \int f_4(f_3(md_{i,M})) \, d\,md_{i,M}, \qquad m_{noM} = \int f_4(f_3(md_{i,noM})) \, d\,md_{i,noM}, \qquad md_g = \frac{m_{noM} - m_M}{\max(m_{noM}, m_M)} \tag{7} $$

where $md_{i,M}$ and $md_{i,noM}$ represent the computed mission delay values for scenario experiments with and without the given mitigation, respectively. $f_3$ is a function $f_3 : X \rightarrow f_3^*$ that takes an arbitrary input X, which might be $md_{i,M}$ or $md_{i,noM}$, and then outputs an approximation function $f_3^*$. $f_4$ is a function $f_4 : f_3^* \rightarrow f_4^*$ that takes an arbitrary function $f_3^*$ and maps it into an approximation function $f_4^*$. $m_M$ and $m_{noM}$ are the approximated mission delays with and without mitigation, respectively. $md_g$ represents the unified mission delay, normalized to [-1, +1]. A computed value $md_g \in [-1, 0)$ signifies that the proposed mitigation increases mission delay, while a computed value $md_g \in (0, +1]$ signifies that the proposed mitigation decreases mission delay.

Figure 5. Baseline architecture and SNF's network partitioning architecture.
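The sign conventions of Equations (6) and (7) can be illustrated with a short Python sketch. It assumes the approximated values ($s_M$, $s_{noM}$, $m_M$, $m_{noM}$) have already been computed and are non-negative; the function names and the sample numbers are hypothetical.

```python
def unified_security(s_m, s_nom):
    """Equation (6): positive when the mitigation raises the security index."""
    denom = max(s_m, s_nom)
    if denom <= 0:
        raise ValueError("at least one approximated value must be positive")
    return (s_m - s_nom) / denom


def unified_mission_delay(m_m, m_nom):
    """Equation (7): positive when the mitigation lowers mission delay."""
    denom = max(m_nom, m_m)
    if denom <= 0:
        raise ValueError("at least one approximated value must be positive")
    return (m_nom - m_m) / denom


# Made-up approximated values, for illustration only.
s_g = unified_security(s_m=0.9, s_nom=0.6)
md_g = unified_mission_delay(m_m=400.0, m_nom=9000.0)
```

Dividing by the larger of the two values keeps both results in [-1, +1] for non-negative inputs, and swapping the order of subtraction for mission delay makes "positive" mean "better" for both quantities.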

Finally, $s_g$ and $md_g$ are combined as:

$$ m_g = f_5(w_1, s_g, w_2, md_g) \tag{8} $$

where $f_5$ is a function $f_5 : \{w_1, s_g, w_2, md_g\} \rightarrow m_g$ that takes the user-defined weighting factors $w_1$ and $w_2$, which represent the relative importance to the user of security and mission impact, respectively, with $w_1 + w_2 = 1$, together with the computed values of $s_g$ and $md_g$, and outputs the unified performance measure $m_g$.

The proposed effectiveness measure given in Equation (8) combines $s_g$ and $md_g$ to provide a unified metric for effectiveness. This metric can be used to measure the effectiveness of a single mitigation and/or compare the effectiveness of multiple mitigations. To the best of our knowledge, the measure given in Equation (8) represents the first attempt to unify the fundamental network concerns of system security and mission performance into a single metric that is easily accessible to security practitioners. The unified metric is also used in another study conducted by the authors that was submitted at the same time as this study.

6  Experiments 

The simulation framework utilized in this study is designed to model the environment, entities, and actors of cyber systems at relevant scales in order to gain a useful understanding of complex system dynamics. The goal is to understand sub-system dynamics and how these dynamics affect, and are affected by, the system. Our model utilizes this framework to capture a full network environment including users, attackers, defenders, and mission operations. The simulation framework is implemented using
NetLogo.^^  Matlab  release^®  2014b  and  Python  2.7  are 
used  for  data  aggregation  across  simulation  runs  and  the 
calculation  of  statistical  measures. 

In this section, we analyze and quantify the effectiveness of two defensive mitigations: (i) SNF (Section 3.1); and (ii) LWC (Section 3.2). The goal of SNF and LWC is to partition a network into enclaves to restrict an attacker's ability to move in a network. As discussed in Section 4, we utilize a two-level hierarchical model that is informed by the outputs of testbed experiments. The testbed environment is used to test various network partitioning architectures as a function of communications between enclaves and between enclaves and the Internet. These communications are abstracted as information flows via software services (depicted in Figures 5-7 as black lines connecting enclaves/Internet; details provided in the following section). The first level of the modeling hierarchy, the enclave model, is meant to characterize device-to-device infection spreading within a single enclave. The second level is the network model, which is used to capture system security and mission impact. To capture these two fundamental concerns at the network scale, we model a representative network environment that supports the AOC mission described in Section 4.3.

Figure 6. LWC architectures that include both mission and non-mission communications: (a) LWC Architecture #1: restrictive with separation of MAGs; (b) LWC Architecture #2: restrictive with clustering of MAGs. In total, 40 mission actor groups (MAGs) interact with 3 mission servers.

Figure 7. LWC architectures that allow only mission communications: (a) LWC Architecture #3: restrictive with separation of MAGs, no non-mission communications; (b) LWC Architecture #4: restrictive with clustering of MAGs, no non-mission communications. In total, 40 mission actor groups (MAGs) interact with 3 mission servers.

6.1 Testbed experimental setup

As described in Section 4.1, our testbed is a proprietary environment that supports coarse-grained tests of a partitioned network. Although network partitioning best practices exist, these provide only vague guidance and, thus, require significant interpretation by network administrators to implement. Generally, there exist many possible ways to implement network partitioning, and selection of an optimal partitioning architecture for a given network environment remains an open problem. For this study, we focus on examining representative partitioning architectures for a mid-sized organization. Here, we examine six partitioning scenarios: a baseline scenario in which no partitioning is used, an architecture representative of SNF, and four architectures representative of LWC. All of these scenarios include some form of Internet connectivity, which is modeled as one or more services connecting the network to the Internet. Figures 5-7 depict the baseline, SNF, and the four LWC scenarios, respectively. The baseline scenario (Figure 5(a)) captures a network that is unpartitioned and the SNF scenario (Figure 5(b)) captures a coarsely partitioned network with four enclaves that represent canonical organizational functions. Both of these architectures are connected to the Internet via 10 services, while the SNF scenario uses a single service to connect enclaves 1-3 to enclave 4. As detailed in Section 3.2, LWC provides a more finely grained partitioning architecture than SNF and thus allows more degrees of freedom with respect to the number of enclaves used and the restriction of communications between these enclaves. These extra degrees of freedom mean there are more possibilities to consider when choosing an architecture representative of the LWC mitigation. We, therefore, select four such architectures, depicted in Figures 6(a) and (b) and 7(a) and (b). The first two, depicted in Figure 6, represent canonical separation of mission-centric and non-mission-centric communications, while the last two, depicted in Figure 7, represent scenarios in which only mission-centric communications are present (see Section 6.3). Communications between enclaves and between enclaves and the Internet for these LWC scenarios are specified by services as shown in the figures.

The output from testbed experiments provides the probability of enclave vulnerability for each enclave of a captured scenario.

Figure 8. The impact of infection rate per unit time, β, on the spreading progress for a vulnerable network with N = 250 and I(0) = 1.

6.2  Enclave  model  experimental  setup 

This model captures device-to-device infection spreading within an enclave. We assume that an attacker penetrates an enclave by compromising a single device and then attempts to spread to other devices in the enclave. As presented in Section 4.2, we use the model given in Yu et al. to mimic infection spread in an enclave. The model takes as input the probability of enclave vulnerability, $P_{vuln}$, for each enclave of a captured scenario from testbed experiments (see previous section). We model a class-C-sized network with 250 total hosts/devices.

Equation (1) is used to compute the number of compromised nodes at a given time, where N = 250 and I(t = 0) = 1, and three threat regimes with respect to infection spreading are examined. These regimes represent differing severities of threat: (i) less aggressive spreading (low threat); (ii) aggressive spreading (medium threat); and (iii) highly aggressive spreading (high threat). As shown in Figure 8, the infection spread rate, β, is varied to model these different severities. To account for the uncertainty inherent to the spreading dynamic, we sweep a range of β values within each regime. Less aggressive spreading is
given  by  /I  ={2.6  x  10^"^,  2.8  x  lO^'*,  3.0  x  lO^'^}  which 
captures  an  attacker  that  can  infect  less  than  40%  of  a 
class  C  network  in  60  days.  Aggressive  spreading  is  given 
by  /I  ={3.2  X  10^"*,  3.4  x  10^"*,  3.6  x  10^"*}  and  captures 
an  attacker  that  can  infect  up  to  80%  of  the  network  in  60 
days.  Finally,  highly  aggressive  spreading  is  given  by 
yS  =  {3.8  X  10^"^,  4.5  X  10^"^,  5.5  X  10^'*}  and  models  an 
attacker  who  can  infect  the  entire  network  in  less  than  30 
days. 
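To make the role of β concrete, the sketch below evaluates a standard logistic susceptible-infected (SI) curve for N = 250 and I(0) = 1. This is an assumption made for illustration: Equation (1) is not restated in this section, so the closed form, the time unit (days), and the β value used here are hypothetical and may differ from the model of Yu et al.

```python
import math


def infected_at(t, beta, n_hosts=250, i0=1):
    """Logistic SI curve: I(t) = N*I0 / (I0 + (N - I0) * exp(-beta*N*t)).

    Assumed form only; it solves dI/dt = beta * I * (N - I) with I(0) = I0.
    """
    decay = math.exp(-beta * n_hosts * t)
    return n_hosts * i0 / (i0 + (n_hosts - i0) * decay)


# Hypothetical rate; prints the infected count every 15 days up to day 60.
beta = 3.0e-4
for day in (0, 15, 30, 45, 60):
    print(day, round(infected_at(day, beta), 1))
```

Under this form a larger β steepens the curve, which is the qualitative behavior that separates the low, medium, and high threat regimes.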

6.3  Network  model  experimental  setup 

This model captures network-scale attack/defense dynamics and mission operations. As discussed above, we consider a representative class-C-sized network with 250 hosts. We model the AOC mission (described in Section 4.3) where the full mission takes three days to complete if uninterrupted and each of three mission users requires one day to complete his/her mission task. Simulation time is specified such that 1,000 time units = 1 simulated day. We collect results from 1,500 Monte Carlo simulation runs in which runs are terminated upon completion of all missions or when simulation time reaches a maximum of 30,000 time units (30 simulated days).
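The aggregation over Monte Carlo runs can be sketched in Python as follows. The function name, the dictionary keys, and the sample delays are hypothetical; recording the cap as the delay for a run that does not finish follows how the results discussion treats such runs.

```python
import statistics

MAX_TIME = 30_000  # simulation cap in time units (30 simulated days)


def summarize_mission_delay(delays, max_time=MAX_TIME):
    """Mean, population standard deviation, and share of runs at the cap."""
    capped = [min(d, max_time) for d in delays]
    at_cap = sum(1 for d in capped if d >= max_time)
    return {
        "mean": statistics.fmean(capped),
        "stdev": statistics.pstdev(capped),
        "censored_fraction": at_cap / len(capped),
    }


# Five made-up runs; two of them hit the cap without completing.
summary = summarize_mission_delay([0, 500, 1200, 30000, 30000])
```

Tracking the share of runs at the cap is useful because a high share means the reported mean understates the true delay.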

As described in Section 6.1, we examine six partitioning architectures: a baseline architecture (network with no partitioning, Figure 5(a)), a representative SNF architecture (partitioned with respect to canonical organizational functions, Figure 5(b)), and four representative LWC architectures (Figures 6 and 7). The LWC architectures represent two general scenarios, one in which a mix of mission and non-mission communications is allowed, and one in which only mission communications are allowed.

Figure 6 depicts two representative architectures that allow both mission and non-mission communications but separate mission-critical devices from non-mission-critical devices. In both of these architectures, enclaves labeled Hosts contain non-mission devices and enclaves labeled MAG1, MAG2, etc. contain mission-critical devices used by mission actor teams or groups (abbreviated as MAGs in the figure; see Section 4.3 for discussion) that communicate with mission servers in the enclave labeled Mission Servers. The difference between these two architectures is in how MAGs are separated: in LWC Architecture #1 (Figure 6(a)) MAGs are completely separated, where each MAG is contained in its own enclave, while in LWC Architecture #2 (Figure 6(b)) MAGs are clustered into three MAG-only enclaves. Figure 7 depicts two representative architectures that allow only mission-critical devices. In both of these architectures, the layer of Host enclaves is removed so that only the MAG and mission server enclaves remain. LWC Architecture #3 (Figure 7(a)) mirrors Architecture #1 with the Host enclaves removed, while LWC Architecture #4 (Figure 7(b)) mirrors Architecture #2, also with the Host enclaves removed.

Figure 9. SNF simulation results: (a) security index, $s_i$; and (b) mission delay, $md_i$.

6.4  Simulation  results 

The following sections present simulation results for the partitioning scenarios described above. First, SNF scenario simulation results are compared to baseline scenario results and then LWC scenario results are compared to baseline scenario results. The purpose here is to examine the relative benefit offered by each mitigation with respect to the baseline (i.e., no mitigation).

6.4.1 SNF results. We compare two scenarios to analyze the effectiveness of the SNF mitigation: a baseline scenario in which no partitioning is used, as shown in Figure 5(a), and a representative SNF architecture in which the entire network is subdivided into four enclaves, as shown in Figure 5(b).

Simulations are executed for both scenarios and output metrics are computed. Figure 9 shows the computed metrics $s_i$ and $md_i$ of Equations (2) and (3), respectively, visualized as statistical box plots depicting the median metric level (red line in the figure) and the variance of computed values over the 1,500 Monte Carlo experiments. As seen in Figure 9(a), the SNF partitioning architecture has a higher $s_i$ on average, with a mean expected ratio of device uptime of 0.968 as opposed to 0.852 for the baseline case. Furthermore, results also indicate that there is significantly less variance in security performance for the SNF architecture, with a standard deviation of 0.001 as opposed to 0.157 for the baseline architecture. This result is compelling as it is a reduction in standard deviation of two orders of magnitude. Figure 9(b) shows significant improvements to mission impact. The SNF architecture gives both lower average mission delay and lower variance in mission performance. The average mission delay for the SNF architecture is 378 time units as opposed to 9,030 time units for the baseline architecture (an order-of-magnitude difference) while the standard deviation is 1,271 time units for SNF as opposed to 13,701 time units for the baseline (also an order-of-magnitude difference).

Figure 10. LWC simulation results for security index, $s_i$.
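The order-of-magnitude statements in the SNF comparison can be checked directly from the quoted values. The short Python sketch below does so; the dictionary keys are labels chosen for this example.

```python
import math

# Values quoted in the SNF results (baseline vs. SNF).
baseline = {"std_uptime": 0.157, "mean_delay": 9030, "std_delay": 13701}
snf = {"std_uptime": 0.001, "mean_delay": 378, "std_delay": 1271}

# Ratio baseline/SNF and its base-10 logarithm (orders of magnitude).
ratios = {key: baseline[key] / snf[key] for key in baseline}
orders = {key: math.log10(value) for key, value in ratios.items()}

for key in ratios:
    print(f"{key}: x{ratios[key]:.1f} ({orders[key]:.2f} orders of magnitude)")
```

The ratios come out at roughly 157 for the standard deviation of uptime, 24 for mean mission delay, and 11 for the standard deviation of mission delay, which agrees with the two-order and one-order claims in the text.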

6.4.2 LWC results. To analyze the effectiveness of the LWC mitigation, we compare five scenarios: the baseline architecture (Figure 5(a)) and four representative LWC architectures (Figures 6 and 7). Simulations are executed for all five scenarios and output metrics are computed. Results are given in Figures 10 and 11.

As seen in Figure 10, results for the security index, $s_i$, show that LWC yields marked improvements to security. All of the LWC architectures give significantly less variance in security performance relative to the baseline scenario. Two of the four LWC architectures give higher average security performance relative to the baseline architecture, while for the other two architectures, the average security is lower but comparable to that of the baseline. The architecture with the best result, LWC Architecture #4, yields a mean expected ratio of device uptime of 0.972 as opposed to 0.852 for the baseline architecture. The four LWC architectures yield a standard deviation in security performance ranging from 0.001 to 0.027, compared to 0.157 for the baseline architecture. This is a compelling result as it indicates a reduction by a factor of roughly 6 to 157, that is, up to two orders of magnitude relative to the baseline.

Figure 11. LWC simulation results for mission delay, $md_i$.

As given in Figure 11, simulation results show that LWC yields noticeable improvements in mission impact. All LWC architectures have lower average mission delay and lower variance in mission delay relative to the baseline architecture. Mean mission delay ranges from 413 to 2,419 time units for the LWC architectures, while for the baseline it is 9,030 time units. The standard deviation of mission delay ranges from 650 to 2,100 time units for the LWC architectures, compared to 13,701 time units for the baseline. These results are also compelling: relative to the baseline, mean mission delay falls by a factor of roughly 4 to 22 and its standard deviation by a factor of roughly 7 to 21, that is, up to about an order of magnitude.

It is important to note that reported results for the baseline are affected by the maximum simulation running time (30,000 time units). For many simulation runs of the baseline architecture, missions did not complete before this time limit and mission delay was therefore computed as the maximum value of 30,000. Thus, the difference in mission performance between the examined mitigation scenarios (SNF and LWC) and the baseline scenario may be even more striking than reported here.

6.5 Unified metric computation

In Section 5.1, we introduced a unified measure for comparing the effectiveness of various mitigations. The goal of $m_g$ is to provide a single measure for effectiveness as explained in Definition 3. The proposed approximation functions $f_1$, $f_2$, $f_3$, and $f_4$ given in Equations (6) and (7) are general functions mapping simulation experiment results into functions that are integrable. $f_5$, shown in Equation (8), takes the approximated security index and mission delay values and combines them with the importance factors to generate a single evaluation value. To incorporate both mean and variance into $m_g$, we use the following functions:

A  =  ffi  Ni  X  (mdi^  M  V  mdi^  „om)  ^  ©  mdt  =  Mdk 
;=i  k=\ 

A  =  i  E  ;) 

i=  1 

X  X  X)  Mdk,i,  (MA.,)  X  X)  | 

$$ f_5 = m_g = 0.5 \times s_g + 0.5 \times md_g \tag{9} $$

where  ©  “  [V,  represents  all  possible  histograms  and  n  is 
the total number of measurements. $f_1$ and $f_3$ take the security index and the mission delay simulation results for the no-mitigation, SNF, and LWC scenarios and map them into histograms. $f_2$ and $f_4$ fit a Gaussian distribution to each newly created histogram and then multiply by the mean of each measurement. $f_5$ is a linear function averaging the enhanced security index and the enhanced mission delay.
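The histogram-then-Gaussian pipeline described above can be sketched in Python with the standard library only. Several choices here are assumptions for illustration rather than the authors' implementation: the bin count, fitting the Gaussian by matching the histogram's mean and variance, integrating the fitted density over the valid range [lo, hi], and the sample values. The function names are hypothetical stand-ins for $f_1$/$f_3$ and $f_2$/$f_4$.

```python
import math
from statistics import NormalDist


def histogram(samples, lo, hi, bins=20):
    """Stand-in for f1/f3: bin counts and bin centers over [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        idx = int((x - lo) / width)
        counts[max(0, min(idx, bins - 1))] += 1  # clamp edge values into range
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    return counts, centers


def gaussian_area_and_mean(samples, lo, hi, bins=20):
    """Stand-in for f2/f4: moment-matched Gaussian fit, then area on [lo, hi]."""
    counts, centers = histogram(samples, lo, hi, bins)
    n = sum(counts)
    mean = sum(c * x for c, x in zip(counts, centers)) / n
    var = sum(c * (x - mean) ** 2 for c, x in zip(counts, centers)) / n
    sigma = max(math.sqrt(var), 1e-9)  # guard against a degenerate fit
    fit = NormalDist(mean, sigma)
    area = fit.cdf(hi) - fit.cdf(lo)   # area under the fitted density
    return area, mean


# Made-up security-index samples from a handful of runs.
runs = [0.62, 0.71, 0.80, 0.84, 0.88, 0.91, 0.93, 0.95, 0.97, 0.99]
area, mean = gaussian_area_and_mean(runs, lo=0.0, hi=1.0)
enhanced = area * mean  # mean-weighted value, as the text describes
```

With this reading, a wide distribution loses part of its fitted area outside the valid range, so both a low mean and a large spread reduce the mean-weighted value.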

Figure 12 presents $S_k$ results (shown in black) with the Normal distribution approximation function (shown in red) for all scenarios. The histogram approximations are obtained by applying $f_1$ (see Equation (9)) to the simulation outputs. $f_2$, presented in Equation (9), is another approximation function that takes the histogram approximations and fits them to the Normal distribution.

Note that, as shown in Figure 12, simulation runs for LWC Architectures #1 and #2 exhibit bimodal distributions for $S_k$. This is due to the extra layer of Host enclaves specified by the architectures in Figure 6: attacks that successfully penetrate the network and make it past the layer of Host enclaves cause a noticeable drop in overall security performance, while attacks that do not make it past this first layer of partitioning result in noticeably better overall security. These outcomes specify the two modes of the distribution.

Now, we calculate the area under each red curve to compute the approximated security index for each architecture shown in Figures 5-7.

Note that the Normal distribution is a symmetric curve. To reward the simulation results at the right side of the curve (close to 1 for $s_i$), the enhanced security index in Equation (6) can be adjusted with the mean of each experiment and calculated as:

$$ s_g = \frac{s_M \times E(s_M) - s_{noM} \times E(s_{noM})}{\max\left(s_M \times E(s_M),\; s_{noM} \times E(s_{noM})\right)} \tag{10} $$

Based on Equation (10), the enhanced security indexes for all scenarios are shown in Table 1.

Figure 12. $S_k$ (Equation (9)) results for all scenarios (axes are displayed with differing scales to improve readability). Legend: histogram of simulation runs (output of $f_1$/$f_3$) and Gaussian fit for the histogram (output of $f_2$/$f_4$), for the baseline, SNF, and LWC scenarios.

Table 1. $m_g$ for all scenarios.

Scenario    s_g      md_g     m_g
SNF         0.322    0.944    0.63
LWC 1       0.037    0.657    0.34
LWC 2       0.185    0.951    0.56
LWC 3       0.13     0.804    0.46
LWC 4       0.318    0.928    0.62

Figure 13 presents $Md_k$ results (shown in blue) with the Normal distribution approximation function (shown in red) for all scenarios. The histogram approximations of each mitigation's mission delay results are obtained by applying $f_3$ (see Equation (9)) to the results. $f_4$, presented in Equation (9), is another approximation function that takes the histogram approximations and fits them to the Normal distribution. As can be seen in the figure, many runs of the baseline architecture have a mission delay of 30,000 time units. This means these missions were not completed before the maximum simulation run time. When we run our experiments with a larger maximum run time, the number of uncompleted missions decreases; however, this does not result in significant changes to the computed value of $f_4^*$ (detailed in Section 5.1).

It is also interesting to note that simulation runs for LWC Architectures #1 and #2, in Figure 13, do not exhibit bimodal distributions for $Md_k$, as is seen for the $S_k$ results (Figure 12) for these same architectures. This is due to the uncertainty inherent to mission operations and attacks on the mission. When an attack manages to penetrate the network and get past the first layer of Host enclaves for these architectures, it is still not certain that it will be able to negatively impact the mission. Due to chance, the attack may compromise mission-critical devices that have already completed their mission operations and, thus, no mission delay will result. This dynamic prevents the distribution from being bimodal.

Due to the similar symmetry of the Normal distribution, we also add a reward factor into Equation (7) and the enhanced mission delay is calculated as:

$$ md_g = \frac{m_{noM} \times E(m_{noM}) - m_M \times E(m_M)}{\max\left(m_{noM} \times E(m_{noM}),\; m_M \times E(m_M)\right)} \tag{11} $$

Based on Equation (11), the enhanced mission delays for all scenarios are given in Table 1.

Figure 13. $Md_k$ (Equation (9)) results for all scenarios (axes are displayed with differing scales for readability).

Assume that $s_g$ and $md_g$ are equally important concerns with respect to mitigation effectiveness and, thus, $w_1$ and $w_2$ of Equation (8) are both specified as 0.5. The unified performance metric $m_g$ for the SNF and LWC mitigations is shown in Table 1. From a practical standpoint, network administrators can view these results as a measurement of the gain in effectiveness at the network scale due to the mitigation. From the table, SNF yields a 63% gain in effectiveness while LWC yields gains ranging from 34% to 62% depending on the architecture implemented. These results indicate that both the SNF and LWC mitigations offer significant benefits to the security posture of a network. Also, the range of results seen for the four LWC architectures examined shows that although LWC has the potential to be a highly effective defensive mitigation, it can be quite difficult to select an appropriate partitioning architecture to implement this mitigation. Thus, network administrators should exercise caution when deploying LWC, as sub-optimal selection of partitioning architecture can have deleterious results on network security. It is also important to note that defensive mitigations are not meant to be used in isolation, but rather in combination as part of a layered defense. Thus, when utilized as part of a greater defensive policy, SNF and LWC can provide great contributions to the security posture of a network against attack.
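The $m_g$ column of Table 1 can be recomputed from the $s_g$ and $md_g$ columns with equal weights. The Python sketch below does this; the helper names are hypothetical, and the observation that the reported values match truncation (rather than rounding) to two decimals is an inference from the arithmetic, not something the text states.

```python
import math

# (s_g, md_g, reported m_g) as listed in Table 1.
TABLE_1 = {
    "SNF":   (0.322, 0.944, 0.63),
    "LWC 1": (0.037, 0.657, 0.34),
    "LWC 2": (0.185, 0.951, 0.56),
    "LWC 3": (0.13,  0.804, 0.46),
    "LWC 4": (0.318, 0.928, 0.62),
}


def unified_metric(s_g, md_g, w1=0.5, w2=0.5):
    """Weighted combination with w1 + w2 = 1 (linear form of Equation (9))."""
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w1 * s_g + w2 * md_g


def truncate2(x):
    """Drop digits beyond two decimals (small epsilon absorbs float noise)."""
    return math.floor(x * 100 + 1e-9) / 100


recomputed = {name: unified_metric(s, md) for name, (s, md, _) in TABLE_1.items()}
for name, value in recomputed.items():
    print(name, round(value, 3), truncate2(value))
```

Changing `w1` and `w2` shows how the ranking shifts when security and mission delay are not weighted equally. With $w_1 = 0.8$, for example, SNF (0.446) pulls clearly ahead of LWC 4 (0.440) and LWC 2 (0.338), because its advantage lies in $s_g$.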

7  Conclusion 

This paper presents a multi-scale hierarchical simulation model designed to evaluate two well-known network-level cyber defense mitigations, SNF and LWC. We quantify the network-scale effects of these mitigations from the perspectives of security and mission impact. Experimental results indicate that both mitigations provide significant benefits to the security posture of a network, with the caveat that LWC results can vary widely due to the extra degrees of freedom involved in selecting an appropriate architecture to implement it. We also introduce a novel metric that combines results for security and mission performance into a single unified measure of mitigation effectiveness that is convenient and easily accessible to security practitioners. This measure can be viewed by practitioners as a quantification of the gain in effectiveness at the network scale due to a defensive mitigation.

Future work is focused on developing a simulation model to replace the testbed, so that more partitioning scenarios can be easily examined, as the resource cost of executing scenarios on the testbed is relatively high. We also plan to test the inclusion of new functions to improve our unified measure.
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